TrustRadius
Azure AI Speech
Formerly Azure Cognitive Speech Services

Overview

What is Azure AI Speech?

The Azure AI Speech service provides a range of speech recognition and generation capabilities, including speech transcription, text-to-speech, speech translation, and speaker recognition.

Pricing

Entry-level set up fee?

  • No setup fee
For the latest information on pricing, visit https://azure.microsoft.com/en…

Offerings

  • Free Trial
  • Free/Freemium Version
  • Premium Consulting/Integration Services

Starting price (does not include set up fee)

  • $1 per month

Product Details

What is Azure AI Speech?

The Speech service unifies speech-to-text, text-to-speech, and speech translation under a single Azure subscription. Its speech capabilities can be added to applications, tools, and devices using the Speech CLI, Speech SDK, Speech Devices SDK, Speech Studio, or REST APIs.
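
As a rough illustration of the REST route mentioned above, the following Python sketch only assembles the URL and headers for a short-audio speech-to-text call; it does not send anything. The endpoint path, the `Ocp-Apim-Subscription-Key` header, and the audio content type are assumptions based on the service's short-audio REST API and should be checked against the current Azure documentation. The function name `build_stt_request`, the region `westeurope`, and the key `YOUR_KEY` are hypothetical placeholders.

```python
from urllib.parse import urlencode


def build_stt_request(region: str, key: str, language: str = "en-US"):
    """Return (url, headers) for a short-audio speech-to-text REST call.

    Assumption: the endpoint path and header names below follow the Speech
    service's short-audio REST API; verify them before relying on this.
    """
    # The recognition language is passed as a query parameter.
    query = urlencode({"language": language, "format": "detailed"})
    url = (
        f"https://{region}.stt.speech.microsoft.com"
        f"/speech/recognition/conversation/cognitiveservices/v1?{query}"
    )
    headers = {
        # Resource key from the Azure portal (placeholder value in the demo).
        "Ocp-Apim-Subscription-Key": key,
        # Assumes 16 kHz PCM audio in a WAV container.
        "Content-Type": "audio/wav; codecs=audio/pcm; samplerate=16000",
        "Accept": "application/json",
    }
    return url, headers


url, headers = build_stt_request("westeurope", "YOUR_KEY", "en-GB")
print(url)
```

The returned pair could then be handed to any HTTP client together with the WAV bytes as the POST body.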

Services include:

Speech to Text - Transcribe audio in more than 92 languages and variants. Gain customer insights with call center transcription, improve experiences with voice-enabled assistants, and capture key discussions in meetings.

Text to Speech - Create apps and services that speak conversationally, choosing from more than 215 voices, and 60 languages and variants. Create natural-sounding audio content, improve accessibility with read-aloud functionality, and create custom voice assistants.

Speech Translation - Translate audio from more than 30 languages and customize translations for an organization's specific terms, working in a preferred programming language.

Speaker Recognition - Confirm a person's identity or recognize who's speaking in a meeting by adding speaker verification and identification to an app.

Custom Commands - Users can build a touchless, voice-first experience to improve safety and support back-to-work scenarios.

Custom Keywords - Create a custom keyword for IoT devices and voice-enabled assistants to set your brand apart, making it more personal, personable, and secure.
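
To make the Text to Speech entry above more concrete: voice selection is commonly expressed as an SSML document that names the voice and wraps the text to be spoken. The sketch below builds such a document in Python. The helper `build_ssml` is hypothetical, and the voice name `en-US-JennyNeural` is an assumption used only for illustration; the voices actually available depend on the subscription and region.

```python
from xml.sax.saxutils import escape, quoteattr


def build_ssml(text: str, voice: str = "en-US-JennyNeural", lang: str = "en-US") -> str:
    """Wrap plain text in a minimal SSML document for a named voice.

    Assumption: the default voice name is illustrative only.
    """
    return (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(lang)}>"
        # escape() protects characters such as & and < inside the spoken text.
        f"<voice name={quoteattr(voice)}>{escape(text)}</voice>"
        "</speak>"
    )


ssml = build_ssml("Tom & Jerry <live>")
print(ssml)
```

Escaping the text matters because user-supplied content containing `&` or `<` would otherwise produce invalid XML.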

Azure AI Speech Technical Details

Deployment Types: Software as a Service (SaaS), Cloud, or Web-Based
Operating Systems: Unspecified
Mobile Application: No

Frequently Asked Questions

The Azure AI Speech service provides a range of speech recognition and generation capabilities, including speech transcription, text-to-speech, speech translation, and speaker recognition.

Azure AI Speech starts at $1 per month.

The most common users of Azure AI Speech are from Enterprises (1,001+ employees).
Reviews and Ratings

(16)

Reviews

(1-5 of 5)
Score 8 out of 10
Vetted Review
Verified User
Incentivized
There are two main uses for this product within our organisation so far. Firstly, we use the accurate voice analysis with custom speech models in lectures to ensure our lectures are accessible to students with hearing-related accessibility issues, mostly through live text translation. Secondly, students are able to use this service and integrate its functionality into their application development during projects within their computing degrees.
  • It implements accurate voice analysis which can be improved with customised speech models
  • Affordable
  • Doesn't have to be run online; can be run and stored locally
  • It can be quite difficult to set up
  • Speech recognition is occasionally inaccurate
  • It sometimes struggles with non-native English speakers' accents
This service is well suited for scenarios where you need to integrate text-to-speech and/or speech-to-text into applications. Within our organisation, it is primarily used by students for development purposes to enable said functionality but is also used to provide accessibility to students who have hearing-related issues. Its multi-language support is also beneficial for our international students who have English as a second language and are therefore able to rapidly translate any text or speech that they do not understand.
  • Accurate speech detection and transcription
  • Live speech detection functionality
  • Easy deployment
  • Increased accessibility of our lectures for students
  • Reduced the time required by lecturers to add closed captions to remote lectures during the COVID-19 pandemic
Having used both this service and IBM Watson's Text to Speech, I can safely say that IBM's product comes out on top, but it is a close call as both products are very good in their own right. That being said, this Azure service lacks some of the extra functionality that can be found in other products, such as broader multi-language support. This product is also more costly than some other alternatives, which is a con in my opinion. Azure does, however, come out on top with regard to customer support and general support of the product, as it is backed by Microsoft, which also means that it integrates well with other parts of the Microsoft suite.
Johnson Martins | TrustRadius Reviewer
Score 9 out of 10
Vetted Review
Verified User
The simplicity of the initial implementation of Azure Cognitive Speech Services is a big plus. The features are unusually flexible, and customizing any function is simple. The software comes with powerful tools, effective voice recognition, and easy management of records and other business data through cloud services. The engagement functions and the predictive data analytics in this solution are also among the best.
  • Supportive data integration functions.
  • Simple adaptation to all functionalities.
  • I really love the speed of data migration with this platform.
  • The initial training when new to this software is an essential process.
  • Tracking a huge amount of recording history.
  • Collecting multiple reports and evaluating them is a tough operation.
The ease of use, the customer service management features, and the intelligent recognition of multiple voices in Azure Cognitive Speech Services are very helpful. Working with the reporting functions is also a great experience, as is managing multiple contacts, and the data analytics functions deliver results in real time.
  • Lead and contacts management functions.
  • Reports tools performance is nice.
  • Data connectors options are very responsive.
  • Functional engagement tools.
  • The platform provides solutions for project information and easy management of client contacts.
  • Reliable tools for quick reporting and the predictive data offered are quite relevant.
  • Multiple data migration and easy to schedule through the platform.
Azure Cognitive Speech Services is simple, and the interface is not complicated even for those just getting started with these customer service tools and voice recognition. Setting the platform dashboard preferences is also easy, and with the ability to manage workflows and documents, the system functions are stable and effective.
Score 8 out of 10
Vetted Review
Verified User
Incentivized
We use Azure Cognitive Speech Services to add speech-to-text, text-to-speech, and other AI-driven, NLP-related speech services to our customised applications, especially those involving chatbots for different business functions. The idea was to use speech services in mobile apps to make them hands-free and more accessible. The range of languages helped, especially from an Indian context, as only one competitor product could support as many Indian languages apart from a few European and Middle Eastern ones.
  • APIs offered are very robust.
  • The number of languages supported is far greater than most of its competitors.
  • Integration with our custom apps was easy.
  • Speech models that we created using neural voices were quite impressive.
  • Translation services worked really well.
  • Built-in machine learning opens it to a lot more business use cases for the future.
  • At times different accents can be an issue, but over time, with more data, this can be further improved, especially with reinforcement learning.
  • Price is on the higher side so ROI is slow to realise.
  • For community development, perhaps some of its source code could be open-sourced for further engagement and development as the overall community is small.
Excellent for voice-enabled apps
Built-in security so speech data does not go outside
Flexible deployment on the cloud
Speech translation in real-time scenarios
Using customised keywords to activate IoT devices
  • Text to speech.
  • Speech to text.
  • Translation APIs.
  • Customizable keywords.
  • Integration with 3rd party apps.
  • Ease of deployment on the cloud.
  • Although it takes time, our apps powered by speech services gave us good ROI.
  • Made our products stand out in finance, HR and operations functions.
  • Gave much-needed AI-powered machine learning integration through NLP offered by Azure.
  • Our chat assistants became more user friendly and thus UX improved.
Price is the number one factor that makes Azure stand out among its competitors. The number of languages supported, especially from an Indian context, is also quite remarkable compared with its competitors, and the vocabulary and accent support matters too. Its cloud-first deployment strategy also makes app deployment very easy.
Google Drive, IBM Cognos Analytics with Watson, Automation Anywhere, Microsoft 365 (formerly Office 365), Sophos Intercept X, Jira Software, VMware Blockchain, Broadcom Test Data Manager (formerly CA Test Data Manager)
No
  • Price
  • Product Features
  • Product Usability
  • Product Reputation
Price. Being a relatively newer technology, speech services with additional features are expensive. Hence we found Azure Cognitive Speech Services to be at least within our budget. Our being office users also helped, relationship-wise.
We may evaluate more on our products integrating with product demos to save time instead of 3rd party products.
Excellent knowledgeable sales team.
After sales support was fantastic
Price
Support package.
APIs
Future version upgrades.
Be honest, take them through your vision and goals, and be realistic and practical. Be very upfront about budgets and the ROI envisaged.
Score 10 out of 10
Vetted Review
Verified User
Incentivized
We have been using chatbots within our organisation for several years. Our users have been asking whether it is possible to have a simulated 'voice conversation' with a chatbot (i.e., the user speaks into their microphone, which is converted into text and passed to the chatbot, which returns a text response which is synthesised into speech). We have recently been using Azure Cognitive Speech Services to handle speech-to-text and text-to-speech elements of interacting with a chatbot.
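
The voice conversation flow described above (speech in, text to the chatbot, text reply synthesised back into speech) can be sketched as a single function. This is a minimal illustration only, not the reviewer's implementation: `voice_turn` and the three stub functions are hypothetical stand-ins, and in a real deployment the speech-to-text and text-to-speech callables would wrap calls to the speech service.

```python
from typing import Callable


def voice_turn(
    audio: bytes,
    speech_to_text: Callable[[bytes], str],
    chatbot: Callable[[str], str],
    text_to_speech: Callable[[str], bytes],
) -> bytes:
    """Run one conversational turn: audio in, synthesised reply audio out."""
    user_text = speech_to_text(audio)
    if not user_text.strip():
        # Nothing was recognised, so ask the user to repeat instead of
        # sending an empty message to the chatbot.
        return text_to_speech("Sorry, I did not catch that.")
    reply_text = chatbot(user_text)
    return text_to_speech(reply_text)


# Hypothetical stubs so the sketch runs without any cloud service.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_bot(text: str) -> str:
    return f"You said: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


reply_audio = voice_turn(b"hello", fake_stt, fake_bot, fake_tts)
print(reply_audio)
```

Passing the three stages in as callables keeps the chatbot independent of the speech provider, which also makes it easy to test the flow with stubs like the ones shown.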
  • Accurate speech-to-text
  • Realistic 'voice' when using text-to-speech
  • Customisable 'voices' for text-to-speech
  • Occasionally, words in text-to-speech are not pronounced correctly
  • Sometimes the speech recognition is inaccurate
  • We have many non-native English speakers in our organisation, and the speech recognition occasionally struggles to understand certain words spoken in different accents
It is well suited for scenarios where there is a requirement to integrate speech-to-text and text-to-speech into user interaction, for example, with chatbots used internally at a large enterprise. We have also investigated the use of Azure Cognitive Speech Services for live captions during meetings and presentations and the additional translation of these captions from English into German.
  • Accurate speech recognition
  • Realistic synthesized text-to-speech 'voices'
  • Ability to translate speech and text from English to German and vice-versa
  • Live captions during meetings and presentations
  • Positive user response
  • Increased usage of internal enterprise chatbots
  • Improved accessibility and inclusivity for sight-impaired users
Azure Bot Service (Microsoft Bot Framework), Microsoft Teams, Microsoft Azure
Score 8 out of 10
Vetted Review
Verified User
Incentivized
We used it for a POC where we had to convert speech recordings of customers calling our helpline into text. These transcripts were to be used for training and for analysing customer sentiment. Azure Cognitive Speech Services was used to convert the speech to text. The scope of the use case was later extended to analyze all conversations with customers calling for inquiries and support.
  • Deployment is easy since it's available on the cloud.
  • It is offered directly as a service, so no expertise in AI or ML is needed by the development team.
  • Security of data, since Azure promises that it does not store the customer data that is processed by the service.
  • More support for Indian regional languages and the ability to interpret Indian dialects.
  • More detailed documentation with more code examples.
Azure Cognitive Speech Services is well suited for scenarios where you need real-time or batch-based data conversion, either from speech to text or from text to speech. It can be used to interpret and document customer or employee conversations, or to create specific training programs. It can also be used to create and train avatars that read from a text document, and to read text aloud for employees with special needs.
  • Security and privacy of client's data - this is most important.
  • Support for multiple languages available.
  • Support from Azure and its partner ecosystem.
  • It has improved productivity of sales training program by 9%.
  • It has reduced manpower at helpline by a significant amount.
Yellow Messenger didn't have speech-to-text capabilities. Azure Cognitive Speech Services was specifically selected because it offered both text-to-speech and speech-to-text, with the added convenience of being deployed on the cloud. Integration and use were also easy.